Use selenium for web browsing #121
Agusx1211
commented
Apr 3, 2023
- Allows for browsing dynamic pages (Twitter, Google, etc.).
- It lets the agent read the whole page (paginated view)
- Resuming can be added back later, but I think this performs better
- I added "No images" to the base prompt, otherwise it tries to fetch images.
- Selenium can later be used to further interact with the page (click buttons, fill forms).
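A minimal sketch of what a paginated view could look like, assuming the page text has already been extracted (for example, from a Selenium-rendered page). The names `paginate_text` and `page_size` are hypothetical and are not taken from this PR; the actual implementation may split pages differently.

```python
from typing import List


def paginate_text(text: str, page_size: int = 1000) -> List[str]:
    """Split text into pages of at most page_size characters.

    Hypothetical helper: breaks on whitespace where possible and
    hard-splits any single word that is longer than one page.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    pages: List[str] = []
    current = ""
    for word in text.split():
        # Hard-split oversized words so no page exceeds the limit.
        while len(word) > page_size:
            if current:
                pages.append(current)
                current = ""
            pages.append(word[:page_size])
            word = word[page_size:]

        candidate = f"{current} {word}" if current else word
        if len(candidate) > page_size:
            # The word does not fit on the current page; start a new one.
            pages.append(current)
            current = word
        else:
            current = candidate

    if current:
        pages.append(current)
    return pages
```

The agent could then request one page at a time by index instead of receiving the whole document in a single response.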
The AI appears to like paging and reading more detailed websites; however, I think there should be an option to obtain a get_text_summary of the whole webpage via the prompt.
@Agusx1211 Could you please resubmit this as an optional feature (enabled by an argument) rather than a replacement for the existing code? There have been some good improvements to browsing since this was submitted, but we're still looking for a major solution.
I can't work on this until late tonight. What do you mean by optional? Adding a flag that changes the behavior?
Yes, it looks like that's what he meant, because it's a major improvement that needs a separate flag.
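One way the requested opt-in could be sketched is with a command-line flag that selects the browsing backend. The flag name `--use-selenium` and the backend labels below are assumptions for illustration only; they are not options confirmed to exist in this project.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical opt-in flag for Selenium browsing."""
    parser = argparse.ArgumentParser(description="Browsing options (illustrative only)")
    parser.add_argument(
        "--use-selenium",  # hypothetical flag name
        action="store_true",
        help="Browse with Selenium instead of the default static fetcher",
    )
    return parser


def select_backend(argv: Optional[List[str]] = None) -> str:
    """Return the backend label chosen by the command-line arguments."""
    args = build_parser().parse_args(argv)
    # The default stays on the existing static path, so current users see no change.
    return "selenium" if args.use_selenium else "requests"
```

Keeping the default on the existing path means the new behavior only applies when explicitly requested.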
See also PR #96, which uses Playwright instead of requests. I also second the request that this be triggered by an argument.
I think this should be an alternate command that the AI can run if the default (static) BeautifulSoup parser fails.
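The fallback idea can be sketched as a small wrapper that tries the static fetcher first and only calls the dynamic one when that fails. Here "fails" is assumed to mean that the static fetcher raised an exception or returned too little text. The function and parameter names are hypothetical, and the two fetchers are passed in as callables so the sketch does not depend on any particular library.

```python
from typing import Callable


def fetch_with_fallback(
    url: str,
    static_fetch: Callable[[str], str],
    dynamic_fetch: Callable[[str], str],
    min_chars: int = 1,
) -> str:
    """Return text from static_fetch, falling back to dynamic_fetch on failure.

    Assumption: a failure is an exception or fewer than min_chars of
    non-whitespace-trimmed text from the static fetcher.
    """
    try:
        text = static_fetch(url)
    except Exception:
        # Treat any static-fetch error as "no usable text".
        text = ""

    if len(text.strip()) >= min_chars:
        return text
    # The static path produced nothing useful; try the browser-based path.
    return dynamic_fetch(url)
```

In practice `static_fetch` would wrap the existing requests and BeautifulSoup code path, and `dynamic_fetch` would wrap a Selenium-based one.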
An important point is that Selenium makes setup harder, and the PR assumes that Chrome is the only Selenium web driver. So, for it to be merged, I guess (I'm not the maintainer) it should be an entirely optional feature with a separate document describing the installation, keeping the current "for dummies" installation procedure as is.
LGTM except for the hardcoded Chrome driver. Support at least Firefox and Edge as well, both in the code and in the instructions.
Also, rebase against the current master.
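A minimal sketch of how the driver choice could be made configurable instead of hardcoded, assuming Selenium 4 is installed along with the matching browser. `normalize_browser`, `create_driver`, and `SUPPORTED_BROWSERS` are hypothetical names. `webdriver.Chrome`, `webdriver.Firefox`, and `webdriver.Edge` are the Selenium driver classes.

```python
SUPPORTED_BROWSERS = ("chrome", "firefox", "edge")


def normalize_browser(name: str) -> str:
    """Validate a browser name and return it in lowercase."""
    key = name.strip().lower()
    if key not in SUPPORTED_BROWSERS:
        raise ValueError(
            f"Unsupported browser {name!r}; expected one of {SUPPORTED_BROWSERS}"
        )
    return key


def create_driver(name: str = "chrome"):
    """Create a Selenium WebDriver for the requested browser."""
    key = normalize_browser(name)
    # Imported lazily so Selenium remains an optional dependency.
    from selenium import webdriver

    factories = {
        "chrome": webdriver.Chrome,
        "firefox": webdriver.Firefox,
        "edge": webdriver.Edge,
    }
    return factories[key]()
```

A caller would use `driver = create_driver("firefox")` and call `driver.quit()` when finished.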
Resolve the conflicts and split the prompt changes.
Separate the prompt changes into a different PR.
Selenium is now in the master branch.